INSTRUCTION SETS FOR PROCESSORS



The present invention relates to instruction sets 
for processors. In particular, the present invention 
relates to processors having two or more different 
instruction sets. The present invention also relates 
to methods of automatically encoding instructions for 
such processors.

A high-performance processor is generally required 
to have an instruction set which can meet two 
requirements: compact code (so that the amount of 
memory required to store the processor's program is 
desirably small), and a rich set of operations and 
operands. Such requirements are particularly important
in the case of an embedded processor, i.e. a processor 
embedded in a system such as in a mobile communications 
device. In this case, high code or instruction density 
is of critical importance because of the limited 
resources of the system, for example in terms of 
available program memory. 

However, these two requirements tend to conflict 
with one another and are difficult to achieve in a 
single unified instruction set, as compact code 
involves a minimal encoding for each of the most 
frequent operations (eliminating the less frequent 
operations from the instruction set) whereas a rich set 
of operations and operands requires an orthogonal 32-bit
reduced instruction set. Consequently, in a
processor having a pre-existing 32-bit instruction set
it has been proposed to add a compact 16-bit
instruction set which provides the most commonly-used
functions and/or access to a limited subset of register
operands.

Fig. 1 of the accompanying drawings shows
schematically the instruction sets in such a processor.
Internally, at the hardware level, the processor has a
set of 32-bit instructions ISint. Externally, the
processor has two instruction sets IS1 and IS2. The
first instruction set IS1 is made up of the same 32-bit
instructions as the internal instruction set ISint. The
second instruction set IS2 is made up of 16-bit instructions,
and the processor contains instruction translation
circuitry 200 for translating each 16-bit instruction
of the external instruction set IS2 into a corresponding
one of the 32-bit instructions of the internal
instruction set ISint.

An embedded processor may also be a very long 
instruction word (VLIW) processor capable of executing 
VLIW instructions. The most important additional 
feature of a VLIW processor is Instruction-Level 
Parallelism (ILP), i.e. its ability to issue two or
more operations simultaneously when executing VLIW 
instructions. 

In such a VLIW processor an instruction issuing 
unit has a plurality of issue slots, each connected 
operatively to a different execution unit. It is 
typical for a VLIW processor that issues two or more 
instructions per processing cycle to encode each 
instruction in a different format (or group of formats) 
depending on the issue slot from which the instruction 
will be issued. The instructions that will be issued 
in the same processing cycle are combined together in a 
VLIW packet or parcel. The position of an instruction 
in the VLIW parcel determines the subset of formats in
which that instruction may be encoded. In this way, 
formats for instructions destined for different 
positions within the VLIW parcel can use identical 
encodings without introducing ambiguity. 

In practice, empirical observation suggests that
90% or more of the instructions within a program are
executed so infrequently that they make up 10% or less
of the execution time. Naturally, the remaining 10% or
less of the instructions occupy 90% or more of the
execution time. Furthermore, it is often the case that the
infrequently-executed parts of a program will not be
able to make effective use of the processor's ability
to issue two or more instructions simultaneously. If
such parts of the program were encoded using a VLIW
instruction set, a large proportion of the instructions
would be "no operation" (NOP) instructions inserted in
the program by the compiler simply to pad out the VLIW
parcels when consecutive instructions cannot appear in
the same VLIW parcel because the result of one
instruction is used by the next. It follows that, for
parts of a program where no effective advantage can be
taken of the ability to issue instructions in parallel,
or where any performance gain from that ability will
have little impact anyway, it is desirable to encode
the program to achieve maximum code density (i.e. using
the smallest possible number of bits).
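
The padding effect described above can be illustrated with a minimal sketch. The greedy in-order packer below, its function names and the three-slot parcel width are assumptions made purely for illustration; they are not taken from the specification.

```python
def pack_parcels(deps, width=3):
    """Greedy in-order packing (illustrative assumption, not the patent's scheduler).

    deps[k] is the set of earlier instruction indices whose results
    instruction k uses. Returns a list of parcels (lists of indices).
    """
    parcels, current = [], []
    for k, needed in enumerate(deps):
        # Start a new parcel when the current one is full, or when
        # instruction k uses a result produced in the current parcel.
        if len(current) == width or any(d in current for d in needed):
            parcels.append(current)
            current = []
        current.append(k)
    if current:
        parcels.append(current)
    return parcels


def nop_count(parcels, width=3):
    """Number of NOPs needed to pad every parcel to the full width."""
    return sum(width - len(p) for p in parcels)


# Four instructions, each using the result of the previous one.
chain = [set(), {0}, {1}, {2}]
# Four mutually independent instructions.
independent = [set(), set(), set(), set()]

print(nop_count(pack_parcels(chain)))        # 8 NOPs for 4 useful instructions
print(nop_count(pack_parcels(independent)))  # 2 NOPs
```

In the dependent chain every instruction occupies a parcel of its own, so two thirds of the encoded slots are padding, which is the situation in which a compact scalar encoding is preferable.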

Accordingly, it is desirable to provide a VLIW
processor with a compact-format instruction set, so as
to combine the instruction-level parallelism of VLIW
architecture with the compact code "footprint" of a
tightly-encoded instruction set such as a 16-bit
instruction set.

In the previously-proposed processor discussed
above with reference to Fig. 1, the compact instruction
set was added after the design of an original 32-bit
instruction set, with the result that the translation
from the 16-bit instructions into 32-bit instructions
is undesirably complex and slow.

It is therefore also desirable to design the
instruction-set formats and encodings in such a way
that the translation from each external-instruction
format (e.g. at least one VLIW format, and at least one
compact format) into a form that can be executed
directly by hardware can be achieved more efficiently.

According to a first aspect of the present
invention there is provided a processor having:
respective first and second external instruction
formats in which instructions are received by the
processor, each instruction having an opcode which
specifies an operation to be executed, and each
external format having one or more preselected opcode
bits in which the opcode appears; an internal
instruction format into which instructions in the
external formats are translated prior to execution of
the operations; wherein: the operations include a first
operation specifiable in both said first and second
external formats, and a second operation specifiable in
said second external format; said first and second
operations have distinct opcodes in said second
external format; and in each said preselected opcode
bit which the first and second external formats have in
common, the opcodes of the first operation in the two
external formats are identical.

According to a second aspect of the present
invention there are provided processor instruction
encodings having: respective first and second external
instruction formats in which the instructions are
received by a processor, each instruction having an
opcode which specifies an operation to be executed, and
each external format having one or more preselected
opcode bits in which the opcode appears; an internal
instruction format into which the processor
instructions in the external formats are translated
prior to execution of the operations; wherein: a first
operation executable by the processor is specifiable in
both said first and second external formats, and a
second operation executable by the processor is
specifiable in said second external format; said first
and second operations have distinct opcodes in said
second external format; and in each said preselected
opcode bit which the first and second external formats
have in common, the opcodes of the first operation in
the two external formats are identical.

According to a third aspect of the present 
invention there is provided a method of encoding 
processor instructions for a processor having 
respective first and second external instruction 
formats in which instructions are received by the 
processor, each instruction having an opcode which 
specifies an operation to be executed, and each 
external format having one or more preselected opcode 
bits in which the opcode appears, the processor also 
having an internal instruction format into which 
instructions in the external formats are translated 
prior to execution of the operations, and the
operations include a first operation specifiable in 
both said first and second external formats, and a 
second operation specifiable in said second external 
format, said method comprising the steps of: encoding 
said first and second operations with distinct opcodes 
in said second external format; and encoding the 
opcodes of the first operation in said first and second 
external formats so that, in each said preselected 
opcode bit which the first and second external formats 
have in common, the opcodes of the first operation in 
the two external formats are identical. 

According to a fourth aspect of the present
invention there is provided a method of encoding
instructions for a processor having two or more
external instruction formats and one or more internal
instruction formats, the method comprising: (a)
selecting initial encoding parameters including a
number of effective opcode bits in each external and
internal format and a set of mapping functions, each
said mapping function serving to translate an opcode
specified by the said opcode bits in one of the
external formats to an opcode specified by the said
opcode bits in the, or in one of the, internal formats;
(b) allocating each operation executable by the
processor an opcode distinct from that allocated to
each other operation in each external and internal
format in which the operation is specifiable, the
allocated opcodes being such that each relevant mapping
function translates such an external-format opcode
allocated to the operation into such an internal-format
opcode allocated to the operation and such that all the
internal-format opcodes allocated to the operation have
the same effective opcode bits; and (c) if in step (b)
no opcode is available for allocation in each
specifiable format for every one of the said
operations, determining which of the said encoding
parameters is constraining the allocation in step (b),
relaxing the constraining parameter, and then repeating
step (b).

According to a fifth aspect of the present 
invention there is provided a computer program which, 
when executed, encodes instructions for a processor 
having two or more external instruction formats and one 
or more internal instruction formats, the computer 
program comprising code portions for: (a) selecting 
initial encoding parameters including a number of 
effective opcode bits in each external and internal 
format and a set of mapping functions, each said 
mapping function serving to translate an opcode 
specified by the said opcode bits in one of the 
external formats to an opcode specified by the said 
opcode bits in the, or in one of the, internal formats; 
(b) allocating each operation executable by the 
processor an opcode distinct from that allocated to 
each other operation in each external and internal 
format in which the operation is specifiable, the 
allocated opcodes being such that each relevant mapping
function translates such an external-format opcode
allocated to the operation into such an internal-format
opcode allocated to the operation and such that all the
internal-format opcodes allocated to the operation have
the same effective opcode bits; and (c) if in step (b)
no opcode is available for allocation in each
specifiable format for every one of the said
operations, determining which of the said encoding
parameters is constraining the allocation in step (b),
relaxing the constraining parameter, and then repeating
step (b).

Reference will now be made, by way of example, to 
the accompanying drawings, in which: 

Fig. 1, discussed hereinbefore, is a schematic 
diagram for use in explaining a previously-proposed 
processor having an additional compact instruction set; 

Fig. 2 shows parts of a processor embodying the 
present invention; 

Fig. 3(A) shows a schematic diagram for use in
explaining previously-considered instruction encodings;

Fig. 3(B) shows a schematic diagram corresponding
to Fig. 3(A) for use in explaining congruent
instruction encodings;

Figs. 4(A) and 4(B) present a flowchart for use in 
explaining a method of encoding instructions embodying 
the present invention; 

Fig. 5 shows a schematic view of external and 
internal instruction formats in a specific example; 

Fig. 6 presents a table illustrating which 
operations are specifiable in each external and 
internal format in the Fig. 5 specific example; 

Figs. 7(A) to 7(H) present schematic diagrams for 
use in explaining different stages of an automatic 
encoding method applied to the Fig. 5 specific example; 
and 

Fig. 8 shows the final instruction encodings
achieved by the method of Fig. 7.

Fig. 2 shows parts of a processor embodying the
present invention. In this example, the processor is a
very long instruction word (VLIW) processor. The 
processor 1 includes an instruction issuing unit 10, a 
schedule storage unit 12, respective first, second and 
third VLIW translation units 4, 6 and 8, a scalar 
translation unit 9, respective first, second and third 
execution units 14, 16 and 18, and a register file 20. 

The instruction issuing unit 10 has three issue
slots IS1, IS2 and IS3 connected respectively to the
first, second and third translation units 4, 6 and 8.
Respective outputs of the first, second and third
translation units 4, 6 and 8 are connected to
respective first inputs of the first, second and third
execution units 14, 16 and 18 respectively.

The instruction issuing unit 10 has a further
output SC connected to the scalar translation unit 9.
An output of the scalar translation unit 9 is connected
in common to a second input of each execution unit 14,
16 and 18.

A first bus 22 connects all three execution units
14, 16 and 18 to the register file 20. A second bus 24
connects the first and second units 14 and 16 (but not
the third execution unit 18 in this embodiment) to a
memory 26 which, in this example, is an external random
access memory (RAM) device. The memory 26 could
alternatively be a RAM internal to the processor 1.

Incidentally, although Fig. 2 shows shared buses
22 and 24 connecting the execution units to the
register file 20 and memory 26, it will be appreciated
that alternatively each execution unit could have its
own independent connection to the register file and
memory.

The processor 1 performs a series of processing
cycles. The processor may operate selectively in two
modes: a scalar mode and a VLIW mode.

In scalar mode the processor executes instructions
from a particular instruction set (which may or may not
be distinct from the VLIW instruction set). In this
mode instructions are not issued at the issue slots IS1
to IS3.

In VLIW mode, on the other hand, the instruction
issuing unit 10 can issue up to 3 instructions in
parallel per cycle at the 3 issue slots IS1 to IS3,
i.e. the full instruction issue width is exploited.

Scalar-mode instructions and VLIW-mode
instructions are both stored together in the schedule
storage unit 12. The instructions are issued according
to an instruction schedule stored in the schedule
storage unit.

As explained later in more detail, instructions in
the instruction schedule are written in at least two
different external formats, including at least one
format belonging to a scalar instruction set of the
processor (hereinafter a "scalar format") and at least
one format belonging to a VLIW instruction set of the
processor (hereinafter a "VLIW format"). In practice,
there may be two or more scalar formats and two or more
VLIW formats. In the case of the VLIW formats it is
possible to have different formats for different issue
slots, although a format may be shared by two or more
issue slots.

On the other hand, within the processor each
execution unit executes instructions in at least one
internal format. Accordingly, each execution unit 14,
16 and 18 is provided with a translation unit 4, 6 or 8
which translates an instruction in one of the external
VLIW formats into the (or, if more than one, the
appropriate) internal format required by the execution
unit concerned. Similarly, the scalar translation unit
9 is provided for translating an instruction in one of
the external scalar formats into the (appropriate)
internal format required by the execution units.

After translation by the relevant translation unit 
4, 6, 8 or 9 the instructions issued by the instruction
issuing unit 10 at the different issue slots or at the
scalar instruction output SC are executed by the 
corresponding execution units 14, 16 and 18. Each of 
the execution units may be designed to execute more 
than one instruction at the same time, so that 
execution of a new instruction can be initiated prior 
to completion of execution of a previous instruction 
issued to the execution unit concerned. 

To execute instructions, each execution unit 14,
16 and 18 has access to the register file 20 via the
first bus 22. Values held in registers contained in
the register file 20 can therefore be read and written
by the execution units 14, 16 and 18. Also, the first
and second execution units 14 and 16 have access via
the second bus 24 to the external memory 26 so as to
enable values stored in memory locations of the
external memory 26 to be read and written as well. The
third execution unit 18 does not have access to the
external memory 26 and so can only manipulate values
contained in the register file 20 in this embodiment.

As outlined above, the architecture of the Fig. 2 
processor defines a compact (e.g. 16-bit) instruction 
set and a wider (e.g. 32-bit) VLIW instruction set. 
There are at least two of these wider instructions in 
each VLIW parcel. Instructions belonging to the 
compact instruction set and the VLIW instruction set 
are encoded using external formats. 

There is also at least one internal instruction 
format to which all instructions in an external format 
are translated during execution. 

Each VLIW parcel is made up of two or more
instructions at different positions (slots) within the
parcel. Each slot within a VLIW parcel may contain an
instruction encoded in one of several external VLIW 
formats. At least some fundamental operations provided 
by the processor (e.g. add, subtract or multiply) may 
need to be available in two or more, or possibly all, 
of the instruction slots of a VLIW parcel. In this 
case, the same fundamental operation may be encoded in 
a different external format per instruction slot. Of 
course, when the instructions in these different 
external formats are translated they must all have the 
same operation code (opcode) within the same group of 
bits in the or each internal format. 

A fundamental operation may also need to be 
available using two or more scalar instructions, for 
example where the same fundamental operation is 
performed using two or more different types of operand 
or operand addressing. In this case, each of the two 
or more scalar instructions relating to the same 
fundamental operation must be encoded using a different 
scalar format and must translate to a different 
internal format. Again, when translated into an 
internal format, these two or more scalar instructions 
must have the same opcode as all VLIW-format
instructions for the same operation which translate to 
the same internal format. Typically, the scalar 
instruction set will be a sub-set of the full (VLIW) 
instruction set, allowing a more compact encoding of 
the external scalar formats. 

The task of designing formats and assigning codes
to each operation in each format is complicated by the
fact that an operation X may appear in external formats F1
and F2, whereas another operation Y may appear in the
external format F2 and in a further external format F3.
This means that the design of the external formats F1,
F2 and F3, and the choice of opcodes for operations X
and Y, are interdependent. Fig. 3(A) shows a simple
example of previously-considered instruction encodings.
In this example, an add operation appears both in
external formats F1 and F2. The add operation in both
formats F1 and F2 is mapped to the same internal format
G1. A load instruction appears in the external format
F2 and in the further external format F3. The load
operation in both formats is translated into the same
internal format G2.

As shown in Fig. 3(A), in the different external
formats F1 to F3, different sets of bits are used for
specifying the opcode, i.e. the opcode fields are
different. In the format F1 the four bits from bit i+1
to bit i+4 are used to specify the opcode. In format
F2, the three bits from bit i+1 to bit i+3 are used to
specify the opcode. In format F3, the four bits from
bit i to i+3 are used to specify the opcode. The
opcode field for F2 may be shorter than for F1 and F3
because there are fewer operations available in F2, for
example.

In Fig. 3(A) the external formats F1 and F2 have
the bits i+1 to i+3 in common as opcode bits. For the
add operation in format F1 and the load operation in
format F2 these common bits i+1 to i+3 are the same,
even though the operations are different. This
complicates the translation process. For example, in
internal format G1 the add operation may have the opcode
"1011". The add operation in format F2 can be
translated into this internal-format opcode simply by
selecting "101" from F2 and appending a "1". However,
to translate the add operation in format F1 into this
internal-format code it is not possible to use a simple
selection operation. In this case it may be necessary
to examine all opcode bits i+1 to i+4 in the external
format F1 and match uniquely the pattern of bits
("1101") which identifies the add operation in format
F1. Anything short of this full examination might not
distinguish it from another operation in F1.

However, if it could be guaranteed that: 

(i) the opcodes for "add" and "load" in format F2 
are distinct, and the same is true for any other pair 
of operations which appear together in the same format 
F2 as well as in at least one other format; and 

(ii) every operation that appears in two or more
external formats (i.e. the "add" operation and any
other which appears in F1 and F2, and the "load"
operation and any other which appears in F2 and F3) is
identically coded in all common opcode bits in all
those formats in which it appears;

then the translation process can be independent of 
the opcodes themselves and can rely only on discovering 
the external format (and, if there is more than one 
internal format, the target internal format) of each 
instruction. Instruction encodings which have this 
property are referred to herein as "congruent" 
instruction encodings.
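
Conditions (i) and (ii) can be checked mechanically. The following is a minimal sketch only: the data layout, the helper names `place` and `is_congruent`, the convention that bit strings are written lowest bit offset first, the "110" pattern for load in F2 under Fig. 3(A) (inferred from the statement that it shares its common bits with add in F1), and the value of bit i for load in F3 are all assumptions made for illustration.

```python
from itertools import combinations

# Opcode bit positions of each external format, as offsets from bit i (Fig. 3).
OPCODE_BITS = {"F1": (1, 2, 3, 4), "F2": (1, 2, 3), "F3": (0, 1, 2, 3)}


def place(fmt, pattern):
    """Map a bit string onto a format's opcode bit offsets (lowest offset first)."""
    return dict(zip(OPCODE_BITS[fmt], pattern))


def is_congruent(encodings):
    """encodings maps (format, operation) -> {bit offset: '0' or '1'}."""
    for (fa, op_a), (fb, op_b) in combinations(encodings, 2):
        a, b = encodings[(fa, op_a)], encodings[(fb, op_b)]
        # Condition (i): two operations in the same format need distinct opcodes.
        if fa == fb and a == b:
            return False
        # Condition (ii): one operation in two formats must agree on shared bits.
        if op_a == op_b and any(a[k] != b[k] for k in a.keys() & b.keys()):
            return False
    return True


fig_3a = {
    ("F1", "add"): place("F1", "1101"),
    ("F2", "add"): place("F2", "101"),
    ("F2", "load"): place("F2", "110"),   # inferred pattern, see lead-in
}
fig_3b = {
    ("F1", "add"): place("F1", "1011"),
    ("F2", "add"): place("F2", "101"),
    ("F2", "load"): place("F2", "011"),
    ("F3", "load"): place("F3", "0011"),  # bit i = '0' is an assumed value
}

print(is_congruent(fig_3a), is_congruent(fig_3b))  # False True
```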

In Fig. 3(B) the add and load operations of Fig.
3(A) have been allocated congruent instruction
encodings. It can be observed that the opcodes
assigned to the add instruction ("1011" in format F1 and
"101" in format F2) are identical in the three opcode
bits that are in common for the two formats F1 and F2
("101").

Similarly, in the case of the load operation
appearing in formats F2 and F3, the three opcode bits
that are in common for formats F2 and F3 are identical
("011") in F2 and F3.

Thus, the instruction encodings in Fig. 3(B) are
congruent. This means that the translation operation
performed by the translation unit can be a simple
bit-selection operation, for example to select some or all
of the bits from i+1 to i+4 in the case of translation
from external format F1 to internal format G1, selecting
some or all of the three bits from i+1 to i+3 in the
case of translation from external format F2 to either
internal format G1 or G2, and selecting some or all of
the four bits from i to i+3 when translating from
external format F3 to internal format G2. The
particular selection of bits required for a given
translation can then be determined simply by
identifying the external format and target internal
format. The identification of the external format can
be made by examining ID bits in the external formats,
for example the bits labelled F1 to F3 in Fig. 3(B).
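
A translation of this kind can be sketched as a table lookup followed by a bit selection. In the sketch below the table `SELECT`, the function `translate`, and the choice of a three-bit internal opcode taken from bits i+1 to i+3 are hypothetical, chosen only to show that the selection depends on the pair of formats and never on the opcode value itself.

```python
# Bits selected for each (external format, internal format) pair, as offsets
# from bit i. The three-bit internal opcode is an assumption for illustration.
SELECT = {
    ("F1", "G1"): (1, 2, 3),
    ("F2", "G1"): (1, 2, 3),
    ("F2", "G2"): (1, 2, 3),
    ("F3", "G2"): (1, 2, 3),
}


def translate(instr_bits, external, internal):
    """Return the internal opcode by pure bit selection.

    instr_bits maps a bit offset to '0' or '1'. No opcode pattern is
    matched: only the two format identifiers drive the selection.
    """
    return "".join(instr_bits[k] for k in SELECT[(external, internal)])


# Fig. 3(B) opcodes, written as offset -> bit (bit i of F3 is an assumed '0').
add_f1 = {1: "1", 2: "0", 3: "1", 4: "1"}
add_f2 = {1: "1", 2: "0", 3: "1"}
load_f2 = {1: "0", 2: "1", 3: "1"}
load_f3 = {0: "0", 1: "0", 2: "1", 3: "1"}

print(translate(add_f1, "F1", "G1"), translate(add_f2, "F2", "G1"))    # 101 101
print(translate(load_f2, "F2", "G2"), translate(load_f3, "F3", "G2"))  # 011 011
```

Because the encodings are congruent, both external variants of each operation arrive at the same internal opcode without any pattern matching.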

The task of designing instruction formats and 
opcodes having the property of congruence is not 
difficult in the simple case illustrated in Fig. 3(B) 
in which only two operations are considered. However, 
when there are many operations in different external 
formats which also appear in different internal formats 
the task of designing formats and assigning opcodes 
becomes very difficult. For example, a processor may 
have approximately 32 to 128 instructions in its scalar 
instruction set, 32 to 128 (or possibly double that) 
instructions in its VLIW instruction set, and perhaps 3 
to 6 different external formats and 4 to 6 different 
internal formats.

This has meant that heretofore the translation 
units used to carry out the translations have been 
undesirably complex, leading to propagation delays and 
excessive power consumption in previously-considered 
processors. 

Next, a method will be described for automatically
designing formats, opcodes and translations so as to
achieve congruent instruction encodings.

In order to describe this method for determining 
opcode fields within instruction formats and deriving 
congruent encodings in those formats let us begin by 
defining the terms we shall use. Let

$$W = \bigcup_{j=1}^{N} G_j$$

be the set of all internal instructions, encoded in N
internal formats.

Each internal format $G_j$ is a proper subset of W,
and comprises a set of internal instructions defined by
the processor that is being implemented. If y is an
instruction encoded in format $G_j$, then the opcode for y
is given by the function $g_j(y)$, which selects a sub-field
containing $a_j$ bits from the instruction format $G_j$.

Let $F_i$ denote an external instruction format, where
$i \in [1, M]$. If x is an instruction encoded in format
$F_i$, then the code for x is given by the function $f_i(x)$,
which selects a sub-field containing $b_i$ bits from the
instruction.

Each internal instruction is represented in memory 
by one or more external instruction formats. Where an 
instruction is represented in two or more external 
formats, each variant must translate to the same 
internal opcode. These variants typically perform the 
same function, though the types and representation of 
their operands may differ. 

The present explanation is concerned with the 
process by which opcode field widths are determined, 
and the process by which operation codes are assigned 
in each format. The encoding of operands is also
important, but is independent of the issue of opcode 
assignment and is therefore not addressed here. 

A translation from external format $F_i$ to internal
format $G_j$ requires a mapping function $m_{i,j}$ which maps the
$b_i$ bits of opcode from $F_i$ to the $a_j$ bits of opcode in $G_j$.
For the purposes of simplicity in implementation and
tractability in design the mappings are preferably bit
selections or permutations. In this explanation it
will also be assumed that there is only one mapping
function for translating between any pair of external
and internal formats.

The instruction set architecture of the processor
defines for each internal instruction y an associated
set of translations, $T_y$, where each translation is a
pair (i, j) identifying an external format as the source
of the translation and an internal format as the
destination of the translation. For each translation
there must exist a mapping function $m_{i,j}$. Hence:

$$T_y = \{\, (i, j) \mid (y \in G_j) \wedge (x \in F_i) \wedge (y = m_{i,j}(x)) \,\} \qquad \text{(eq 1)}$$

Each format, whether internal or external, has a
cardinality determined by the number of opcodes within
the format. The cardinality of $F_i$ is written $|F_i|$, and
hence the sizes of the opcode fields in external and
internal formats must satisfy the following
inequalities:

$$a_j \ge \log_2(|G_j|)$$
$$b_i \ge \log_2(|F_i|) \qquad \text{(eq 2)}$$
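
As a small worked example of eq 2, the smallest whole number of opcode bits that satisfies the inequality for a format containing n operations is the ceiling of the base-2 logarithm of n. The helper name `min_opcode_bits` is an assumption for illustration; the integer `bit_length` idiom is used to avoid floating-point rounding.

```python
def min_opcode_bits(n_operations):
    """Smallest integer width w with 2**w >= n_operations (eq 2)."""
    if n_operations < 1:
        raise ValueError("a format must contain at least one operation")
    return (n_operations - 1).bit_length()


for n in (1, 5, 8, 9, 128):
    print(n, min_opcode_bits(n))
```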

Each internal format $G_j$ therefore defines opcodes
in the range $[0, 2^{a_j})$, and each external format $F_i$ defines
opcodes in the range $[0, 2^{b_i})$. At any point during the
method $Q_j$ contains the set of opcodes available to be
allocated to operations in internal format $G_j$.
Similarly, $R_i$ contains the set of opcodes available to
be allocated to operations in external format $F_i$.

The problem now consists of determining a unique
opcode for each instruction $y \in W$, and determining
suitable selection- or permutation-based mapping
functions for each translation defined in the
instruction set architecture. One preferred embodiment
of the method can now be expressed in pseudo-code,
using the terminology introduced above, as shown in the
flowchart of Figs. 4(A) and 4(B).

Each mapping function $m_{i,j}$ initially maps a chosen
number $b_i$ of effective opcode bits of the external
format $F_i$ to a chosen number $a_j$ of effective opcode bits
of the internal format $G_j$. This can map no more than
$q = \min(a_j, b_i)$ bits from external format $F_i$ to $a_j$ bits in
internal format $G_j$, setting any undefined bits among the
$a_j$ bits to zero. For simplicity, it will be assumed in this
preferred embodiment that each mapping function
involves selecting all bits of the external-format
opcode to be some or all of the bits of the internal-format
opcode after translation. Other mapping
functions can be used in other embodiments of the
invention, for example mapping functions involving
permutations.

The method begins in step S1 by first computing
the minimum possible number $a_j$ or $b_i$ of opcode bits that
could theoretically encode the number of instructions
in each external format and each internal format. This
minimum possible number $a_j$ or $b_i$ is used as an initial
number of effective opcode bits for the format
concerned.

In step S2, a new series of iterations is started
(as explained later, several series may be required in
a practical situation). Firstly, for each internal
format $G_j$, a set $Q_j$ of available opcodes is formed, made
up of all possible opcodes definable by the $a_j$ bits.
Similarly, for each external format $F_i$, a set $R_i$ of
available opcodes is assigned, made up of all possible
opcodes definable by the $b_i$ bits. As explained later,
each available opcode may have a working number of bits
greater than the computed minimum possible number $a_j$ or
$b_i$ of opcode bits. For example, the working number for
all available opcodes in all sets $Q_j$ and $R_i$ may be set
equal to the highest computed minimum possible number $a_j$
or $b_i$.

Step S3 involves iterating through all operations 
in the internal formats and determining their opcodes 
in each external format where they occur. 

During each series of iterations, steps S4 to S9
are performed per iteration. One fundamental operation
is considered per iteration. In step S4, for the
considered operation, the method examines the pair of
sets $R_i$ and $Q_j$ for the external format and internal
format of each mapping function needed to translate the
considered operation, and identifies as a mutual set $h_t$
any members the two sets of the pair have in common.
In step S5 a set H of common members of all the mutual
sets $h_t$ for all the needed mapping functions is formed.
If the result is an empty set in step S6, then no
allowable mapping is found and the method goes to step
S11 where the constraints are relaxed. If H contains
at least one common opcode, step S7 selects the or one
of the common opcodes in H.

Then in step S8 the selected opcode is removed
from each set Ri and Qj for the external and internal
formats in which the considered operation appears, i.e.
the sets examined in step S4.
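A single iteration (steps S4 to S8) can be sketched as
follows. This is a minimal sketch under stated
assumptions: the function name allocate_one, the use of
dictionaries keyed by format index for the sets Ri and
Qj, and the choice of the lowest common opcode in step S7
are all introduced here for illustration and are not
prescribed by the method.

```python
def allocate_one(pairs, R, Q):
    """Steps S4 to S8 for one considered operation.

    pairs: the (i, j) translation pairs needed for the operation.
    R, Q:  dicts mapping a format index to its set of available opcodes.
    Returns the selected opcode, or None when H is empty (step S6).
    """
    # S4: one mutual set ht per needed mapping function.
    mutual_sets = [R[i] & Q[j] for i, j in pairs]
    # S5: H holds the opcodes common to all the mutual sets.
    H = set.intersection(*mutual_sets) if mutual_sets else set()
    if not H:  # S6: no allowable mapping was found
        return None
    chosen = min(H)  # S7: any member of H would do (assumed: take the lowest)
    for i, j in pairs:  # S8: remove the opcode from the examined sets
        R[i].discard(chosen)
        Q[j].discard(chosen)
    return chosen


R = {1: {"00", "01"}, 2: {"00", "01", "10", "11"}}
Q = {1: {"00", "01", "10", "11"}}
first = allocate_one([(1, 1), (2, 1)], R, Q)
second = allocate_one([(1, 1), (2, 1)], R, Q)
third = allocate_one([(1, 1), (2, 1)], R, Q)  # R[1] is exhausted by now
```

In this toy case the third call returns None because the
two-member set for external format 1 has been used up,
which is the situation handled by the relaxation in step
S11.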

The method terminates when it is determined in 
step S9 that the method has successfully allocated 
opcodes to all operations in all the required external 
and internal formats.

The method is guaranteed to terminate because the
back-tracking process in step S11 successively relaxes
the encoding constraints until there are as many opcode
bits as are needed to find a congruent assignment of
codes.

In addition to selecting bits from the external
format Fi, the mapping function may also permute the
bits. For example, the order of the bits may be
reversed by the mapping function. Such permutations
can be used when the number of mapped bits reaches q,
where q = min(aj, bi).
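A mapping function that selects and permutes bits can be
sketched as follows. The function name apply_mapping and
the representation of a mapping as a tuple of bit
positions (counted from the left of the external opcode)
are assumptions made only for this illustration.

```python
def apply_mapping(opcode: str, selection: tuple[int, ...]) -> str:
    """Build the translated opcode by taking bits of `opcode` in the given order.

    `selection` lists positions in the external opcode; listing fewer
    positions selects a subset of bits, and reordering them permutes the bits.
    """
    return "".join(opcode[index] for index in selection)


identity = apply_mapping("011", (0, 1, 2))     # bits passed through unchanged
reversed_bits = apply_mapping("011", (2, 1, 0))  # bit order reversed
subset = apply_mapping("10110", (4, 0, 2))     # three of five bits, reordered
```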

If p = max(aj, bi), then the total number of
possible permutations is p!/(p-q)!. Hence, for large
instruction sets, the number of possible permutations
could be very large. In practice, however, it is
typical for p to be about 5 and q to be about 3. This
means a maximum of 60 different permutation functions
for each mapping. Typically one might expect there to
be five different mappings, leading to a total of 60^5
possible sets of mapping functions to consider on each
iteration of the method defined by steps S4 to S9 (i.e.
about 778 million possibilities). This is within the
capabilities of a modern computer to enumerate and
evaluate automatically.
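The arithmetic in the preceding paragraph can be checked
with a few lines of Python. The variable names are
illustrative only; math.perm(p, q) computes p!/(p-q)!
and requires Python 3.8 or later.

```python
import math

p, q = 5, 3                        # typical values quoted in the text
per_mapping = math.perm(p, q)      # p! / (p - q)! permutation functions
mappings = 5                       # typical number of mapping functions
total = per_mapping ** mappings    # sets of mapping functions per iteration

print(per_mapping, total)          # prints: 60 777600000
```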

For larger field widths the number of possible
permutations grows intractably large. However, it is
still possible to operate the method successfully in
this case by restricting the class of permutations that
will be searched. For example, there are n(n+1)/2
possible permutations of an n-bit field defined by
swapping arbitrary pairs of bits. By choosing such a
restriction on the possible permutations to be examined
by the method, the running time of the method could be
constrained to be polynomial in n.

Next, operation of the method described with
reference to Figs. 4(A) and 4(B) will be illustrated
with reference to a specific example. In this example,
a VLIW processor, for example a processor generally in
accordance with Fig. 2, has the capability to issue two
instructions simultaneously from issue slots A and B
respectively.

Referring to Fig. 5, it can be seen that the
external VLIW formats allowed for instructions to be
issued from issue slot A include first and second
external VLIW formats F1 and F2. The opcode bits in
external format F1 are denoted by C1 in Fig. 5, and the
opcode bits in format F2 are denoted by C2.

In the case of instructions to be issued from
issue slot B, two external VLIW formats are also
available: one of them is the same external format F2 as
is available at issue slot A, and the other is a third
external VLIW format F3. The opcode bits in format F3
are denoted by C3 in Fig. 5.

In addition, the processor in this example is 
capable of operating in a scalar mode to execute 
instructions in one of two different 16-bit scalar 
external formats F4 and F5. The opcode bits in format 
F4 are denoted by C4 in Fig. 5, and the opcode bits in 
format F5 are denoted by C5.

The processor in this example also has two
internal formats G1 and G2. The opcode bits in the
internal format G1 are denoted by Ca in Fig. 5, and the
opcode bits in internal format G2 are denoted by Cb.
Each scalar instruction translates into a single
operation in one or both of the internal formats G1 and
G2, encoded in either the Ca or Cb field.

As also shown schematically in Fig. 5, the
processor has three translation units 30, 32 and 34.
The translation unit 30 corresponds to issue slot A and
is operable to translate opcode bits C1 in external
format F1 or opcode bits C2 in external format F2 into
either opcode bits Ca in internal format G1 or opcode
bits Cb in internal format G2.

Similarly, the translation unit 32 corresponds to
issue slot B and is operable to translate opcode bits C2
in external format F2 or opcode bits C3 in external
format F3 into opcode bits Ca in internal format G1 or
opcode bits Cb in internal format G2.

The translation unit 34 corresponds to the scalar
instructions and is operable to translate either opcode
bits C4 in external format F4 or opcode bits C5 in
external format F5 into opcode bits Ca in internal
format G1 or opcode bits Cb in internal format G2.

It will be appreciated that the translation units
30 and 32 in Fig. 5 correspond to the translation units
4, 6 and 8 in Fig. 2, and that the translation unit 34
in Fig. 5 corresponds to the translation unit 9 in Fig.
2.

Referring now to Fig. 6, the processor in the
present example has a small set of seven fundamental
operations: an addition operation add, a logical OR
operation or, a multiply operation mul, a load
immediate operation li, a subtraction operation sub, a
return from VLIW-mode operation rv and a division
operation div. The table presented in Fig. 6 lists
these seven fundamental operations in the first
(left-hand) column. The second column in Fig. 6
indicates in which internal formats the operation
concerned is permitted to appear. The add, or, mul, li
and sub instructions are permitted to appear in both
internal formats G1 and G2 and so have "G1" and "G2"
rows, but the rv and div instructions are only permitted
to appear in internal format G2 and so have no "G1" row.

The remaining six columns in Fig. 6 relate to the
five external instruction formats F1 to F5. The
external format F2 has two columns allocated to it in
this case, as this format is allowed at both issue slot
A and issue slot B.

Each cell in one of the six external-format
columns corresponds to an instruction. Some of the
cells are shaded whilst others are blank. An
instruction I in a cell at row Gj and column Fi must be
represented in external format Fi and must be translated
to internal format Gj if its cell is shaded. If the
cell is not shaded then the instruction I concerned is
not present in external format Fi. Take, for example,
the cell denoted by an asterisk in Fig. 6. This cell
is at row G1 for the or instruction, and at column F1.
The shading of the cell indicates that the or
instruction is present in external format F1 and
internal format G1, requiring that opcodes for the or
operation are appropriately chosen in both formats and
that a translation exists for the or instruction
between these two formats.

The algorithm described previously with reference 
to Figs. 4(A) and 4(B) will now be applied to the 
present example of Figs. 5 and 6 to determine the 
opcodes, the opcode field widths in each format, and 
the mapping functions (translations) between formats. 

The set W of fundamental operations in this
example can be written as:

W = {add, or, mul, li, sub, rv, div}

The number N of internal formats is 2 (G1 and G2),
and the number M of external formats is 5 (F1 to F5).

Looking at Fig. 6, for each external format Fi a
mapping function mi,j is required if, for any operation,
there is a shaded cell in the Fi column in row Gj. For
example, taking the external format F1, it can be seen
that a mapping function m1,1 is required for internal
format G1 but not for internal format G2, as no cell in
the F1 column is shaded in a G2 row.

Thus, the following mapping functions are required
in the present example: m1,1, m2,1, m2,2, m3,2, m4,1,
m4,2, m5,1 and m5,2.

The translation pairs T for each operation, which
are derived directly from Fig. 6, are as follows, each
pair (i,j) denoting a translation from external format
Fi to internal format Gj:

T_add = {(1,1), (2,1), (2,2), (3,2), (4,1), (4,2), (5,1), (5,2)}
T_or  = {(1,1), (2,1), (2,2), (3,2), (4,1), (4,2), (5,1), (5,2)}
T_mul = {(1,1), (2,1), (2,2), (3,2), (4,1), (4,2), (5,1), (5,2)}
T_li  = {(2,1), (2,2), (4,1), (4,2)}
T_sub = {(1,1), (3,2), (5,1), (5,2)}
T_rv  = {(3,2)}
T_div = {(3,2), (5,2)}

. . . (eq 4)
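The list of required mapping functions is simply the
union of the translation pairs over all operations. The
following sketch illustrates this; the dictionary T and
the names ALL, required and names are introduced here
for illustration only, and the li entry uses the four
pairs that are examined for li in the fourth iteration
described later.

```python
# Translation pairs (i, j): external format Fi translated to internal format Gj.
ALL = {(1, 1), (2, 1), (2, 2), (3, 2), (4, 1), (4, 2), (5, 1), (5, 2)}
T = {
    "add": ALL,
    "or": ALL,
    "mul": ALL,
    "li": {(2, 1), (2, 2), (4, 1), (4, 2)},
    "sub": {(1, 1), (3, 2), (5, 1), (5, 2)},
    "rv": {(3, 2)},
    "div": {(3, 2), (5, 2)},
}

# A mapping function mi,j is needed as soon as any operation uses the pair (i, j).
required = sorted(set().union(*T.values()))
names = [f"m{i},{j}" for i, j in required]
```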



In step S1 of the algorithm (Fig. 4(A)) the number
of opcodes required in each format is determined. For
each external format this is determined by observing
the number of operations for which there is at least
one shaded cell in the column for that external format.
For example, in the case of the external format F1 it
can be seen that four operations (add, or, mul and sub)
have a shaded cell in the column concerned. Where an
external format has two columns (such as the external
format F2) an operation is only counted once even if it
appears in one internal format in one column and
another internal format in another column. Thus, in the
case of the external format F2, the number of operations
|F2| is 4.

In the case of an internal format the number of
opcodes required is calculated by counting the total
number of rows (containing at least one shaded cell)
allocated to the internal format concerned. For
example, the internal format G1 has five rows with
shaded cells. The internal format G2 has seven rows
with shaded cells.

Thus, the numbers of opcodes required in the
different internal and external formats are: |G1|=5,
|G2|=7, |F1|=4, |F2|=4, |F3|=6, |F4|=4 and |F5|=5.

As a result, in step S1, the initial numbers of
effective opcode bits are determined as a1=3, a2=3,
b1=2, b2=2, b3=3, b4=2 and b5=3. These numbers represent
the minimum possible numbers of bits that could
theoretically encode the number of operations appearing
in the format concerned, and may have to be increased
in the course of execution of the algorithm.
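The step S1 calculation amounts to taking the smallest
number of bits whose code space is at least as large as
the number of opcodes required. A minimal sketch follows;
the function name min_opcode_bits and the dictionary
layout are assumptions made for illustration, while the
counts are those stated above.

```python
def min_opcode_bits(count: int) -> int:
    """Smallest number of bits giving at least `count` distinct codes."""
    return (count - 1).bit_length() if count > 0 else 0


# Numbers of opcodes required per format, as determined in step S1.
counts = {"G1": 5, "G2": 7, "F1": 4, "F2": 4, "F3": 6, "F4": 4, "F5": 5}
bits = {name: min_opcode_bits(n) for name, n in counts.items()}
working_bits = max(bits.values())  # one possible choice of working number
```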

In step S2, a set of available opcodes is created 
for each external format and for each internal format, 
as shown in equation 5. 



R1 = {000, 001, 010, 011}
R2 = {000, 001, 010, 011}
R3 = {000, 001, 010, 011, 100, 101, 110, 111}
R4 = {000, 001, 010, 011}
R5 = {000, 001, 010, 011, 100, 101, 110, 111}
Q1 = {000, 001, 010, 011, 100, 101, 110, 111}
Q2 = {000, 001, 010, 011, 100, 101, 110, 111}

. . . (eq 5)

The working number of bits in each opcode is
initially set to be equal to the highest required
number of opcode bits amongst any of the internal and
external formats, i.e. 3 opcode bits as required by the
formats G1, G2, F3 and F5. The initial set R1 of opcodes
for external format F1 is made up of four three-bit
codes 000, 001, 010 and 011. Four codes are required as
b1 was calculated to be 2 in step S1. The same is true
for the sets R2 and R4 of the other two-bit external
formats F2 and F4.

In the case of the external formats F3 and F5 eight
codes are required and the initial codes assigned to R3
and R5 are 000, 001, 010, 011, 100, 101, 110 and 111.

Each of the internal formats G1 and G2 also
requires eight codes (a1=3 and a2=3) so the initial sets
Q1 and Q2 of opcodes for these internal formats are
the same as the sets R3 and R5 for the external formats
F3 and F5.

In step S3 a first series of iterations is
commenced, and in this first series the first operation
in Fig. 6, i.e. the add operation, is selected for
initial consideration.

In step S4, the available opcodes for the
operation that are unused (not yet allocated) in each
relevant pair of external and internal formats (8 pairs
in all: F1-G1, F2-G1, F4-G1, F5-G1, F2-G2, F3-G2, F4-G2
and F5-G2 in this case) are considered. Because no
opcodes have yet been allocated, for the 5 pairs F1-G1,
F2-G1, F4-G1, F2-G2 and F4-G2 ht = {000, 001, 010, 011}
while for the 3 pairs F5-G1, F3-G2 and F5-G2 ht = {000,
001, 010, 011, 100, 101, 110, 111}. Thus, in step S5
H = {000, 001, 010, 011}.

In step S6 it is checked whether H is empty. In
this case it is not, so processing proceeds to step S7.
Here, the opcode c=000 is selected first from H. The
opcode 000 therefore becomes allocated to the add
operation.

In step S8 the internal-format opcode sets Q1 and
Q2 are updated to remove therefrom the opcode 000, if
contained therein. Thus, the code 000 is removed from
each of the sets Q1 and Q2.

Also in step S8 the set of available opcodes for
each relevant external format (in this case all of the
external formats F1 to F5) is updated to remove
therefrom the opcode 000, if contained therein. Thus,
000 is removed from each of the sets R1 to R5.

The results of the allocations performed in the
first iteration are shown in Fig. 7(A). In Figs. 7(A)
to 7(H) the opcodes remaining in the sets Q or R are
shown. Also, any opcode allocations made in the
external and internal formats are entered in the
relevant cells.

Processing then returns to step S3 for the second
iteration of this series. In the second iteration, the
or operation is considered. The pairs to be considered
in step S4 are the same as for the first iteration.
The results of steps S4 and S5 are that H = {001, 010,
011}. Thus, in step S6, H is not empty and processing
proceeds to step S7. In step S7 the opcode c=001 is
selected. Accordingly, in step S8, the opcode 001 is
removed from each of the sets Q1 and Q2 of available
opcodes for the internal formats G1 and G2. Similarly,
in the sets R1 to R5 for the external formats F1 to F5,
the code 001 is removed. The results after the second
iteration are shown in Fig. 7(B).

In the third iteration, the mul operation is
considered. Again, the pairs to be considered in step
S4 are the same as for the first and second iterations.
In this case, the result H of the computation performed
in step S5 is {010, 011}, so that, in step S7, the
opcode 010 is selected. In step S8 the opcode 010 is
removed from all the sets Q1, Q2 and R1 to R5.

Thus, 010 becomes allocated to the mul operation. 
Fig. 7(C) shows the state reached at this time. 

In the fourth iteration of the series the li
instruction is considered. In this case the pairs to
be examined in step S4 are F2-G1, F4-G1, F2-G2 and F4-G2.
In step S5 of this iteration it is determined that
H = {011}. As the H set is not empty, processing goes on
to step S7. Here, the code 011 is selected (it is the
only code available in the set H). The code 011
therefore becomes assigned to li. This code is removed
from the relevant sets Q1, Q2, R2 and R4, but is left in
the sets R1, R3 and R5. The resulting state is shown in
Fig. 7(D).

In the fifth iteration, the sub instruction is
considered. In step S4 the set of translations T_sub =
{(1,1), (3,2), (5,1), (5,2)}. Accordingly, as the
pairs of external and internal formats for these
translations are F1-G1, F5-G1, F3-G2 and F5-G2, the
mutual sets ht are {} for F1-G1 and {100, 101, 110, 111}
for F5-G1, F3-G2 and F5-G2.

This means H = {} (the empty set) in step S5. This
is because, although 100, 101, 110 and 111 are still
unused in R3, R5, Q1 and Q2, none of these codes is
available in the remaining relevant set R1, which only
contains 011. Accordingly, processing proceeds via step
S6 to step S11 in which the constraint is assessed. It
is determined that the intersection between R1 and Q1
(and between R1 and Q2) is the empty set. Since R1 has
fewer members than Q1 and Q2 it can reasonably be
concluded that R1 is the constraining factor. To
overcome this constraint the number of effective opcode
bits for F1 needs to be increased beyond its initial
value of 2. Accordingly, b1 is increased by one to 3.
The remaining values b2 to b5, a1 and a2 are left
unchanged.

Now, all of the existing opcode assignments are
void and a second series of iterations is commenced at
step S2. In this series of iterations R1 = {000, 001,
010, 011, 100, 101, 110, 111} initially. In the fifth
iteration of this second series the sub instruction is
again considered. At this stage the state is shown in
Fig. 7(E).

This time, in step S5 H = {100, 101, 110, 111}. In
step S7 the opcode 100 is selected. In step S8, 100 is
removed from R1, R3, R5, Q1 and Q2. The resulting state
is shown in Fig. 7(F).

In the sixth iteration of the second series, the
rv instruction is considered for the first time. In
step S5 H = {101, 110, 111}. In step S7 the opcode 101
is selected. In step S8, 101 is removed from R3 and Q2.
The resulting state is shown in Fig. 7(G).

In the seventh iteration of the second series, the
div instruction is considered for the first time. In
step S5 H = {110, 111}. In step S7 the opcode 110 is
selected. In step S8, 110 is removed from R3, R5 and
Q2. The resulting state is shown in Fig. 7(H).
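The two series of iterations walked through above can be
reproduced with the following sketch. It is an
illustration under stated assumptions rather than a
definitive implementation: the function names, the use of
bit strings, the selection of the lowest opcode in step
S7, and in particular the relaxation rule in step S11
(add one bit to the involved format whose available set
is smallest) are choices made here. The translation
pairs are those of the example, with li taking the four
pairs examined in the fourth iteration.

```python
def min_bits(count):
    """Step S1: fewest bits that give at least `count` distinct codes."""
    return max(count - 1, 0).bit_length()


def encode(T, n_ext, n_int):
    """Greedy opcode allocation with relaxation (steps S1 to S9 and S11)."""
    ops = list(T)
    # S1: initial effective opcode bits b[i] (external) and a[j] (internal).
    b = {i: min_bits(sum(any(e == i for e, _ in T[op]) for op in ops))
         for i in range(1, n_ext + 1)}
    a = {j: min_bits(sum(any(g == j for _, g in T[op]) for op in ops))
         for j in range(1, n_int + 1)}
    while True:
        # S2: start a new series with fresh sets of available opcodes.
        width = max(list(a.values()) + list(b.values()))
        R = {i: {format(v, f"0{width}b") for v in range(2 ** b[i])} for i in b}
        Q = {j: {format(v, f"0{width}b") for v in range(2 ** a[j])} for j in a}
        assigned = {}
        for op in ops:  # S3: consider one operation per iteration
            # S4 and S5: intersect the mutual sets of every needed mapping.
            H = set.intersection(*(R[i] & Q[j] for i, j in T[op]))
            if not H:  # S6: no common opcode is available
                # S11 (assumed heuristic): relax the involved format whose
                # available set is smallest by giving it one more bit.
                candidates = [(len(R[i]), "b", i) for i in {e for e, _ in T[op]}]
                candidates += [(len(Q[j]), "a", j) for j in {g for _, g in T[op]}]
                _, kind, index = min(candidates)
                (b if kind == "b" else a)[index] += 1
                break  # existing allocations are void; start a new series
            chosen = min(H)  # S7
            assigned[op] = chosen
            for i, j in T[op]:  # S8
                R[i].discard(chosen)
                Q[j].discard(chosen)
        else:
            return assigned, a, b  # S9: every operation has an opcode


ALL = [(1, 1), (2, 1), (2, 2), (3, 2), (4, 1), (4, 2), (5, 1), (5, 2)]
T = {
    "add": ALL, "or": ALL, "mul": ALL,
    "li": [(2, 1), (2, 2), (4, 1), (4, 2)],
    "sub": [(1, 1), (3, 2), (5, 1), (5, 2)],
    "rv": [(3, 2)],
    "div": [(3, 2), (5, 2)],
}
assigned, a, b = encode(T, n_ext=5, n_int=2)
```

Under these assumptions the first series fails at sub,
the number of effective opcode bits for F1 is raised from
2 to 3, and the second series allocates 000, 001, 010,
011, 100, 101 and 110 to add, or, mul, li, sub, rv and
div respectively.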

At this point all instructions have been allocated
opcodes and the processing moves to step S10. In this
step the opcodes assigned so far are examined to
determine how many bits in each external format
actually need to be provided in the instructions in the
external format concerned. For example, in the
external format F4 all the allocated codes 000, 001, 010
and 011 have the prefix 0. This means that the prefix
0 is entirely redundant in external format F4.
Accordingly, provided that the format F4 can still be
distinguished from all other external formats, the
prefix 0 can be omitted from instructions in format F4
so that only a 2-bit opcode field is required for
format F4. The same is true for external format F2.

It follows of course that the mapping functions
m4,1, m4,2, m2,1 and m2,2 must insert the 0 prefix during
translation so that the add, or, mul and li operations
in formats F2 and F4 are distinguished from the sub, rv
and div operations in formats F1, F3 and F5.

This optimisation step S10 becomes particularly
important when the number of prefix bits is greater
than the number of bits in each instruction set needed
to give each operation a distinct opcode in each
external format.
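The prefix removal of step S10, and the corresponding
adjustment of a mapping function, can be sketched as
follows. The function names strip_common_prefix and
translate are hypothetical; os.path.commonprefix is used
only because it compares strings character by character.

```python
import os


def strip_common_prefix(opcodes):
    """Return (prefix, shortened opcodes) for one external format."""
    prefix = os.path.commonprefix(list(opcodes))
    return prefix, [code[len(prefix):] for code in opcodes]


def translate(short_code, prefix):
    """Mapping function after optimisation: prepend the removed prefix."""
    return prefix + short_code


# External format F4: add, or, mul and li all share the prefix 0.
f4_prefix, f4_codes = strip_common_prefix(["000", "001", "010", "011"])
# External format F3: the allocated codes have no common prefix.
f3_prefix, f3_codes = strip_common_prefix(["000", "001", "010", "100", "101", "110"])
restored = translate("11", f4_prefix)  # li as seen by the internal format
```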

The final opcodes after optimisation are shown in 
Fig. 8. 

A method embodying the present invention can be
implemented by a general-purpose computer operating in
accordance with a computer program. This computer
program may be carried by any suitable carrier medium
such as a storage medium (e.g. floppy disk or CD-ROM)
or a signal. Such a carrier signal could be a signal
downloaded via a communications network such as the
Internet. The appended computer program claims are to
be interpreted as covering a computer program by itself
or in any of the above-mentioned forms.

Although the above description relates, by way of
example, to a VLIW processor it will be appreciated
that the present invention is applicable to processors
other than VLIW processors. A processor embodying the
present invention may be included as a processor "core"
in a highly-integrated "system-on-a-chip" (SOC) for use
in multimedia applications, network routers, video
mobile phones, intelligent automobiles, digital
television, voice recognition, 3D games, etc.



CLAIMS: 

1. A processor having: 

respective first and second external instruction 
formats in which instructions are received by the 
processor, each instruction having an opcode which 
specifies an operation to be executed, and each 
external format having one or more preselected opcode 
bits in which the opcode appears; 

an internal instruction format into which 
instructions in the external formats are translated 
prior to execution of the operations; 

wherein: 

the operations include a first operation 
specifiable in both said first and second external 
formats, and a second operation specifiable in said 
second external format; 

said first and second operations have distinct 
opcodes in said second external format; and 

in each said preselected opcode bit which the 
first and second external formats have in common, the 
opcodes of the first operation in the two external 
formats are identical.

2. A processor as claimed in claim 1, wherein:

the operations include one or more further first
operations, each specifiable in both said first and
second external formats, and one or more further second
operations specifiable in said second external format;

for every pair of operations, made up of one said
first operation and one said second operation, the
operations of the pair have distinct opcodes in said
second external format; and

in each said preselected opcode bit which the
first and second external formats have in common, the
opcodes of each first operation in the two external
formats are identical.



3. A processor as claimed in claim 1 or 2,
having:

a third external instruction format in which 
instructions are received by the processor, each 
instruction having an opcode which specifies an 
operation to be executed, and said third external 
format having one or more preselected opcode bits in 
which the opcode appears; 

respective first and second internal instruction 
formats into which instructions in the external formats 
are translated prior to execution of the operations; 

wherein: 

said second operation is specifiable in both said 
second and third external formats; 

an instruction specifying said first operation in 
either said first or second external format is 
translated into said first internal format, and an 
instruction specifying said second operation in either 
said second or third external format is translated into 
said second internal format; and 

in each said preselected opcode bit which the 
second and third external formats have in common, the 
opcodes of the second operation in the two external 
formats are identical. 

4. A processor as claimed in claim 3, wherein:

the operations include one or more further first
operations, each specifiable in both said first and
second external formats, and one or more further second
operations specifiable in said second external format;

for every pair of operations, made up of one said
first operation and one said second operation, the
operations of the pair have distinct opcodes in said
second external format;

in each said preselected opcode bit which the
first and second external formats have in common, the
opcodes of each first operation in the two external
formats are identical; and

in each said preselected opcode bit which the
second and third external formats have in common, the
opcodes of each first operation in the two external
formats are identical.

5. A processor as claimed in any preceding
claim, being a VLIW processor, wherein one external
format is a scalar instruction format used for scalar
instructions, and another external format is a VLIW
instruction format used for VLIW instructions.

6. A processor as claimed in any preceding 
claim, being a VLIW processor, wherein the external 
formats are or include two different VLIW formats. 

7. A processor as claimed in claim 6, wherein 
the two different VLIW formats are used in different 
respective instruction slots of a VLIW instruction 
parcel.

8. A processor as claimed in claim 6 or 7, 
wherein at least one instruction slot of a VLIW 
instruction parcel uses the two different VLIW formats. 

9. A processor as claimed in any preceding 
claim, wherein one external format has an instruction 
width different from that of another external format. 

10. A processor as claimed in any preceding
claim, having:

translation means operable to perform a
predetermined translation operation for translating
each said external-format opcode into a corresponding
internal-format opcode.

11. A processor as claimed in claim 10, wherein
said translation operation involves selecting and/or
permuting bits amongst the said preselected opcode bits
in the external-format instruction.

12. A processor as claimed in claim 10 or 11,
wherein the translation operation is independent of the
external-format opcode.

13. A processor as claimed in claim 12, wherein
the translation means are operable to identify the
internal format into which each external-format
instruction is to be translated, and to carry out the
said translation operation according to the identified
internal format.

14. Processor instruction encodings having:

respective first and second external instruction
formats in which the instructions are received by a
processor, each instruction having an opcode which
specifies an operation to be executed, and each
external format having one or more preselected opcode
bits in which the opcode appears;

an internal instruction format into which the
processor instructions in the external formats are
translated prior to execution of the operations;

wherein:

a first operation executable by the processor is
specifiable in both said first and second external
formats, and a second operation executable by the
processor is specifiable in said second external
format;

said first and second operations have distinct
opcodes in said second external format; and

in each said preselected opcode bit which the
first and second external formats have in common, the
opcodes of the first operation in the two external
formats are identical.

15. A method of encoding processor instructions
for a processor having respective first and second
external instruction formats in which instructions are
received by the processor, each instruction having an
opcode which specifies an operation to be executed, and
each external format having one or more preselected
opcode bits in which the opcode appears, the processor
also having an internal instruction format into which
instructions in the external formats are translated
prior to execution of the operations, and the
operations include a first operation specifiable in
both said first and second external formats, and a
second operation specifiable in said second external
format, said method comprising the steps of:

encoding said first and second operations with
distinct opcodes in said second external format; and

encoding the opcodes of the first operation in
said first and second external formats so that, in each
said preselected opcode bit which the first and second
external formats have in common, the opcodes of the
first operation in the two external formats are
identical.

16. A method of encoding instructions for a 
processor having two or more external instruction 
formats and one or more internal instruction formats, 
the method comprising: 

(a) selecting initial encoding parameters
including a number of effective opcode bits in each
external and internal format and a set of mapping
functions, each said mapping function serving to
translate an opcode specified by the said opcode bits
in one of the external formats to an opcode specified
by the said opcode bits in the, or in one of the,
internal formats;

(b) allocating each operation executable by the
processor an opcode distinct from that allocated to
each other operation in each external and internal
format in which the operation is specifiable, the
allocated opcodes being such that each relevant mapping
function translates such an external-format opcode
allocated to the operation into such an internal-format
opcode allocated to the operation and such that all the
internal-format opcodes allocated to the operation have
the same effective opcode bits; and



(c) if in step (b) no opcode is available for
allocation in each specifiable format for every one of
the said operations, determining which of the said
encoding parameters is constraining the allocation in
step (b), relaxing the constraining parameter, and then
repeating step (b).

17. A method as claimed in claim 16, wherein each
said mapping function involves selecting all bits of
the external-format opcode as some or all of the bits
of the internal-format opcode.

18. A method as claimed in claim 16 or 17, 
wherein in step (a) , for each external and internal 
format, the said number of effective opcode bits is 
made equal to a minimum possible number of opcode bits 
that could theoretically encode the number of 
operations specifiable in the format concerned. 

19. A method as claimed in any one of claims 16
to 18, wherein step (b) comprises a series of
iterations, and prior to commencing the series of
iterations a set of available opcodes in each external
and internal format is formed, and in each iteration of
the series one said operation is considered and the
allocation of the opcode to the considered operation is
made based on an examination of the sets of available
opcodes in each external and internal format in which
the considered operation is specifiable.

20. A method as claimed in claim 19, wherein, for
each said external and internal format, the set of
available opcodes formed prior to commencing a series
of iterations has a number of members dependent upon
the said number of effective opcode bits currently
applicable to that format.

21. A method as claimed in claim 19 or 20, 
wherein the available opcodes in all the sets have the 
same working number of bits. 

22. A method as claimed in claim 21, wherein the
said working number is set equal to a minimum possible
number of opcode bits that could theoretically encode
the number of operations specifiable in the external or
internal format having the highest number of operations
specifiable in the format concerned.

23. A method as claimed in any one of claims 19
to 22, wherein each said iteration of step (b)
comprises:

(b-1) determining which, if any, available opcodes
are common to the sets for all the external and
internal formats in which the considered operation is
specifiable; and

(b-2) if it is determined in step (b-1) that one
or more such available opcodes are common, selecting
the or one of the common opcodes, allocating it to the
considered operation, and removing the selected opcode
from the set for each external and internal format in
which the considered operation is specifiable.

24. A method as claimed in claim 23 wherein each 
said iteration of step (b) further comprises: 

(b-3) if it is determined in step (b-1) that no
common available opcode is present in the sets for all
the external and internal formats in which the
considered operation is specifiable, making all
existing allocated opcodes void and carrying out step
(c).

25. A method as claimed in any one of claims 16
to 24, further comprising:

(d) after all of the operations have been
allocated one of the said available opcodes having the
said working number of bits, determining for each
external format whether that working number is greater
than a minimum number of bits needed to provide each
operation specifiable in that external format with its
own distinct opcode and, if so, restricting the
allocated opcodes in that external format to the
determined minimum number of bits.

26. A method as claimed in claim 25, wherein step
(d) comprises:

(d-1) identifying for each external format a
maximum-length common prefix, if any, for all allocated
opcodes in the external format concerned;

(d-2) removing the identified common prefix from
all the allocated opcodes in the external format
concerned; and

(d-3) adjusting each mapping function that serves
to translate an opcode specified by the opcode bits in
the external format concerned into an opcode specified
by internal-format opcode bits so that the mapping
function prepends the identified common prefix to the
external-format opcode bits during translation.

27. A method as claimed in any one of claims 16
to 26, wherein in step (c) if it is determined that the
number of effective opcode bits in one of the external
or internal formats is the constraining parameter, the
number of effective opcode bits in that format is
increased.

28. A method as claimed in any one of claims 16
to 27, carried out by electronic data processing means. 

29. A computer program which, when executed, 
encodes instructions for a processor having two or more 
external instruction formats and one or more internal 
instruction formats, the computer program comprising 
code portions for: 

(a) selecting initial encoding parameters 
including a number of effective opcode bits in each 
external and internal format and a set of mapping 
functions, each said mapping function serving to 
translate an opcode specified by the said opcode bits 
in one of the external formats to an opcode specified 
by the said opcode bits in the, or in one of the, 
internal formats; 

(b) allocating each operation executable by the
processor an opcode distinct from that allocated to
each other operation in each external and internal
format in which the operation is specifiable, the
allocated opcodes being such that each relevant mapping
function translates such an external-format opcode
allocated to the operation into such an internal-format
opcode allocated to the operation and such that all the
internal-format opcodes allocated to the operation have
the same effective opcode bits; and

(c) if in step (b) no opcode is available for
allocation in each specifiable format for
every one of the said operations, determining which of
the said encoding parameters is constraining the
allocation in step (b), relaxing the constraining
parameter, and then repeating step (b).
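
Purely by way of illustration, steps (a) to (c) might be sketched in Python as follows. The sketch makes two simplifying assumptions that are not taken from the claims: every mapping function is the identity on the opcode value (an operation receives the same number in every format in which it is specifiable), and the narrowest format in which the failing operation is specifiable is treated as the constraining parameter. The names `allocate`, `bits`, `ops` and `max_bits`, and the example formats and operations, are hypothetical.

```python
def allocate(bits, ops, max_bits=8):
    """Allocate one opcode per operation, relaxing bit widths on failure.

    bits: format name -> initial number of effective opcode bits (step (a))
    ops:  operation name -> list of formats in which it is specifiable
    Returns the final bit widths and the operation -> opcode allocation.
    """
    bits = dict(bits)
    while True:
        # Every format starts with all opcodes of its current width available.
        avail = {f: set(range(2 ** n)) for f, n in bits.items()}
        alloc = {}
        for op, fmts in ops.items():
            # Step (b): an opcode must be available in every relevant format.
            common = set.intersection(*(avail[f] for f in fmts))
            if not common:
                # Step (c): relax the (assumed) constraining parameter by
                # widening the narrowest relevant format, then start over.
                tight = min(fmts, key=lambda f: bits[f])
                if bits[tight] >= max_bits:
                    raise ValueError("no allocation within max_bits")
                bits[tight] += 1
                break
            code = min(common)
            alloc[op] = code
            for f in fmts:
                avail[f].discard(code)  # keep opcodes distinct per format
        else:
            return bits, alloc  # every operation received an opcode


# Hypothetical example: two external formats (F1, F2), one internal (G1).
final_bits, opcodes = allocate(
    {"F1": 1, "F2": 2, "G1": 2},
    {
        "add": ["F1", "F2", "G1"],
        "load": ["F2", "G1"],
        "sub": ["F1", "G1"],
        "mul": ["F2", "G1"],
    },
)
```

With this input the first pass fails at sub, because the single opcode bit of F1 leaves no opcode that is also free in G1; F1 is widened to two bits and the second pass allocates all four operations.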

30. A computer program which, when run on a 
computer, causes the computer to carry out the encoding 
method of any one of claims 16 to 28. 

31. A computer program as claimed in claim 29 or 
30, carried by a carrier medium. 

32. A computer program as claimed in claim 31, 
wherein the said carrier medium is a storage medium. 

33. A computer program as claimed in claim 31, 
wherein the said carrier medium is a signal.

34. A processor substantially as hereinbefore 
described with reference to any of Figs. 2 to 8 except 
3(A) of the accompanying drawings. 

35. Processor instruction encodings substantially
as hereinbefore described with reference to Figs. 2 to
8 except 3(A) of the accompanying drawings.

36. A method of encoding processor instructions 
substantially as hereinbefore described with reference 
to the accompanying drawings.

37. A computer program which, when run on a 
computer, causes the computer to carry out an encoding 
method substantially as hereinbefore described with
reference to the accompanying drawings.

ABSTRACT
INSTRUCTION SETS FOR PROCESSORS



A processor has respective first and second
external instruction formats (F1, F2) in which
instructions (add, load) are received by the processor.
Each instruction has an opcode (e.g. 1011) which
specifies an operation to be executed. Each external
format has one or more preselected opcode bits (F1:
i+1 to i+4; F2: i+1 to i+3) in which the opcode appears.
The processor also has an internal instruction format (G1)
into which instructions in the external formats are
translated prior to execution of the operation.

A first operation (add) is specifiable in both the
first and second external formats (F1, F2), and a second
operation (load) is specifiable in the second external
format (F2). The first and second operations have
distinct opcodes (101, 011) in the second external
format. In each of the preselected opcode bits which
the first and second external formats have in common
(i+1 to i+3), the opcodes of the first operation (101) in
the two external formats are identical.

Such "congruent" instruction encodings can enable
a translation process, for translating the external-
format opcode into a corresponding internal-format
opcode, to be carried out simply and quickly without
the need to positively identify each individual
external-format opcode.
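
As a non-authoritative illustration of the "congruent" property, the following Python sketch checks whether every operation has identical opcode bits in the bit positions that its formats share. It assumes that the opcode fields of all formats start at the same bit position and that opcodes are written as bit strings beginning at that position, so that the shared bits are simply the leading characters; the function name `is_congruent` and the data layout are hypothetical.

```python
def is_congruent(encodings):
    """Check the congruence property for a set of instruction encodings.

    encodings: operation name -> {format name: opcode bit string}.
    Two opcodes of the same operation agree if they are identical in the
    leading bits that both formats provide (assumed common bit positions).
    """
    for by_format in encodings.values():
        codes = list(by_format.values())
        for a in codes:
            for b in codes:
                shared = min(len(a), len(b))
                if a[:shared] != b[:shared]:
                    return False
    return True


# Opcodes quoted in the abstract: add is 1011 in F1 and 101 in F2,
# load is 011 in F2.
congruent = is_congruent({
    "add": {"F1": "1011", "F2": "101"},
    "load": {"F2": "011"},
})

# A hypothetical encoding in which add differs in the shared bits.
not_congruent = is_congruent({"add": {"F1": "1101", "F2": "101"}})
```

When the property holds, a translator can pass the shared opcode bits through unchanged instead of decoding each external-format opcode individually.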



[Fig. 3(B)] 







[Fig. 3(A), as far as legible: opcodes 1101, 101 (F2), 110 (F2) and 1011 (F3) for the add and load operations.]



[Fig. 3(B): add is encoded as 1011 in format F1 and as 101 in format F2; load is encoded as 011 in format F2 and as 1011 in format F3.]





[Flowchart, Fig. 4(A) and continuation sheet; step labels are given only where legible.]

S1: For all j ∈ [1,N] let o_j, and for all i ∈ [1,M] let b_i, be the initial number of opcode bits, b_i = ⌈log2(|F_i|)⌉.

S2: For all j ∈ [1,N] let Q_j = [0, 2^o_j).
    For all i ∈ [1,M] let R_i = [0, 2^b_i).

For all y ∈ W do the steps from S4 onwards:

S4: For all t ∈ T_y, where t = (i, j), compute h_t = R_i ∩ Q_j.

Compute H = ∩ h_t.

If H is empty (Yes): Determine whether the translation is constrained by the number of external opcode bits or the number of internal opcode bits. Increase whichever is the limiting factor and go back to S2 to re-compute the opcode assignments.

S7: Select any c ∈ H and let c be the opcode for operation y.

For all t ∈ T_y, where t = (i, j), let Q_j = Q_j − {c} and let R_i = R_i − {c}.

For all i ∈ [1,M] determine the minimum number of bits in each external format and restrict the opcode field width to that number of bits in format F_i. This is achieved by identifying the maximum-length common prefix for all opcodes in F_i, removing that prefix from all opcodes in F_i and then prepending that prefix to all opcodes during translation.



[Drawing with the label "VLIW formats" and a second, illegible "... formats" label.]





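[Table: operations add, or, mul, li, sub, rv and div against the columns Internal formats, External formats (Scalar) and External formats (VLIW), the latter divided into Issue slot A and Issue slot B. The cells hold internal-format labels (G1, G2); their positions are not recoverable.]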



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add has been allocated opcode 000. Remaining available opcodes:]

R1 = {001, 010, 011}
R2 = {001, 010, 011}
R3 = {001, 010, 011, 100, 101, 110, 111}
R4 = {001, 010, 011}
R5 = {001, 010, 011, 100, 101, 110, 111}
Q1 = {001, 010, 011, 100, 101, 110, 111}
Q2 = {001, 010, 011, 100, 101, 110, 111}



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add has been allocated 000 and or has been allocated 001. Remaining available opcodes:]

R1 = {010, 011}
R2 = {010, 011}
R3 = {010, 011, 100, 101, 110, 111}
R4 = {010, 011}
R5 = {010, 011, 100, 101, 110, 111}
Q1 = {010, 011, 100, 101, 110, 111}
Q2 = {010, 011, 100, 101, 110, 111}



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add, or and mul have been allocated 000, 001 and 010. Remaining available opcodes, as far as legible:]

R1 = {011}
R3 = {011, 100, 101, 110, 111}
R4 = {011}
R5 = {011, 100, 101, 110, 111}
Q1 = {011, 100, 101, 110, 111}
Q2 = {011, 100, 101, 110, 111}



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add, or, mul and li have been allocated 000, 001, 010 and 011. Remaining available opcodes:]

R1 = {011}
R2 = { }
R3 = {011, 100, 101, 110, 111}
R4 = { }
R5 = {011, 100, 101, 110, 111}
Q1 = {100, 101, 110, 111}
Q2 = {100, 101, 110, 111}











[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) with add, or, mul and li allocated 000, 001, 010 and 011, and with R1 widened. Remaining available opcodes:]

R1 = {011, 100, 101, 110, 111}
R2 = { }
R3 = {011, 100, 101, 110, 111}
R4 = { }
R5 = {011, 100, 101, 110, 111}
Q1 = {100, 101, 110, 111}
Q2 = {100, 101, 110, 111}



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add, or, mul, li and sub have been allocated 000, 001, 010, 011 and 100. Remaining available opcodes:]

R1 = {011, 101, 110, 111}
R2 = { }
R3 = {011, 101, 110, 111}
R4 = { }
R5 = {011, 101, 110, 111}
Q1 = {101, 110, 111}
Q2 = {101, 110, 111}



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add, or, mul, li, sub and rv have been allocated 000, 001, 010, 011, 100 and 101. Remaining available opcodes, as far as legible:]

R1 = {011, 101, 110, 111}
R3 = {011, 110, 111}
R5 = {011, 101, 110, 111}
Q1 = {101, 110, 111}
Q2 = {110, 111}



[Allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) after add, or, mul, li, sub, rv and div have been allocated 000, 001, 010, 011, 100, 101 and 110. Remaining available opcodes, as far as legible:]

R1 = {011, 101, 110, 111}
R3 = {011, 111}
R5 = {011, 101, 111}
Q1 = {101, 110, 111}
Q2 = {111}









[Final allocation table (columns: Operation; Internal formats; External formats (Scalar); External formats (VLIW), Issue slot A and Issue slot B) for add, or, mul, li, sub, rv and div. Some external-format columns carry two-bit opcodes (00, 01, 10, 11) and others carry three-bit opcodes (000 to 110). Remaining available opcodes, as far as legible:]

R1 = {011, 101, 110, 111}
R2 = { }
R3 = {011, 111}
R5 = {011, 101, 111}
Q1 = {101, 110, 111}
Q2 = {111}



